ABSTRACT

ARIS is an artificial intelligence system which uses
the English language to learn, understand, and communicate. The
system attempts to simulate the psychoneurological processes which
enable man to communicate verbally. It uses a modified
stratificational grammar model and is being programmed in PL/1 (a
programming language) for an IBM 360/67 computer. In its present state
of development, ARIS uses a crude simulator of verbal communication
similar to Weizenbaum's ELIZA program. From this base an attempt will
be made to develop a concept network having human-like
characteristics. The stratificational model will be extended to the
concept strata, using Piaget's developmental theories regarding the
dynamic nature of knowledge. The two necessary characteristics for
the structure of knowledge (hierarchical and dynamic relations) will
then be the natural consequences of the resulting network. (Author/JY)

ASPECTS OF A NATURAL LANGUAGE BASED ARTIFICIAL INTELLIGENCE SYSTEM



REPORT #7

George A. Borden
The Pennsylvania State University
University Park, Pa. 16802 

ABSTRACT 



ARIS is an artificial intelligence system which uses the English
language to learn, understand, and communicate. Based on present
psychoneurological theories, it attempts to simulate the psychoneurological
processes which enable man to communicate. It uses a modified stratificational
grammar model and is being programmed in PL/1 for an IBM 360/67. This paper
outlines its present state, comparing it with Weizenbaum's ELIZA program, and
speaks to the problems of developing a concept network having human-like
characteristics. This is accomplished by extending the stratificational model to
the concept strata. The two necessary characteristics for the structure of
knowledge, i.e., hierarchical and dynamic relations, are then natural consequences
of the resulting network.

LANGUAGE AND THE STRUCTURE OF KNOWLEDGE 

ARIS is a Natural Language Based Artificial Intelligence System. It is
characterized by its ability to learn, understand, and communicate through
the use of the English language. It uses written language as the medium
for communicating, and the word as its smallest unit of analysis (all
punctuation marks are treated as words). Dr. William Nelson, of the State
University of New York at Oneonta, and I have been developing the theoretical
bases for this system for the past two years. We have now reached the stage
of beginning to program the system.

Our approach is rather elementary. We are beginning with a crude
simulator of verbal communication similar to Weizenbaum's ELIZA. (1) The
main objective of this simulator is to simulate verbal output. It does this
at the expense of gross oversimplifications of the internal processes of
decoding, information processing, and encoding. It does not have the ability
to learn and thus is totally inadequate as an artificial intelligence system.
The progress of the project will be measured by the degree to which the many
internal processes involved in human verbal communication are simulated.

The basic model for ARIS may be schematized as follows:

[Schematic not reproduced; surviving label: INFORMATION]

Input is no problem. It consists of typed English sentences from an
on-line terminal. Decoding presents few problems as well. It is accomplished
by determining the boundaries of each word and locating the word in ARIS'
lexicon. We are using a hash code which uniquely identifies each word and
computes its location in memory. Whether the word is found or not is noted,
and the processing moves on to the information processing unit. Now the fun
begins. If the word is not found in the lexicon, it must be added. This is
not so difficult once we understand the information format for a lexical entry.
If the word is found, the information processing unit begins to establish the
boundaries of the semantic space mapped by the verbal stream being processed.
When the processing is complete, the output decision is made. If a reply is
to be made, the encoding process is instigated. (This, too, offers some sticky
problems which I shall not consider in this paper.) As the message is encoded
into a verbal signal, it is typed out on the same terminal on which the input
was entered.
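
The decoding step just described can be sketched roughly as follows. This is only an illustration, written in Python rather than the PL/1 of the actual system. The names `tokenize` and `Lexicon` and the placeholder entry format are assumptions made for the example, and Python's built-in dictionary stands in for the hash code that computes a word's location in memory.

```python
import re

def tokenize(sentence):
    """Split typed input into words, treating each punctuation mark as a word."""
    # \w+(?:'\w+)* keeps apostrophes inside words; [^\w\s] matches one punctuation mark.
    return re.findall(r"\w+(?:'\w+)*|[^\w\s]", sentence.upper())

class Lexicon:
    """Hypothetical lexicon: a hash table keyed by word."""

    def __init__(self):
        self.entries = {}

    def decode(self, sentence):
        """Return (word, found) pairs, adding unknown words to the lexicon."""
        result = []
        for word in tokenize(sentence):
            found = word in self.entries
            if not found:
                # Placeholder entry; the real lexical format is not given in the paper.
                self.entries[word] = {"seen": 0}
            self.entries[word]["seen"] += 1
            result.append((word, found))
        return result

lexicon = Lexicon()
first = lexicon.decode("I hear Los Angeles is sinking.")
second = lexicon.decode("Los Angeles is sinking.")
```

On the first sentence every word is new and is added to the lexicon; on the second, every word is found, which is the condition under which the information processing unit would take over.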

The above quick run-through is offered only as a brief introduction to 
the total system. I'm sure you all have questions about how we plan to solve 
the many problems in each phase of the total process. We have solved some of 
them and have a few ideas about how we can solve many of those that remain. 
Since we are attempting to simulate the human processes involved in verbal
communication as closely as possible, we lean heavily upon existing theories
about these processes. I would like to speak now about one particular 
problem we have encountered in the information processing phase of this 
system. 
The decoding and encoding processes in the ARIS system are built
around a modified stratificational grammar for written English*.
Stratificational grammar is a natural for this system since it is based
on the assumption that language is a network of relations in which one can
proceed from sound to meaning or from meaning to sound using the same
network**. This relational network can be programmed for computer
manipulation using the word as the basic unit of analysis rather than
the phoneme. When this is done, the mechanics for the encoding-decoding
processes exist, and decoding can be accomplished quite easily, though
encoding is a bit more difficult. The encoding difficulty lies at the
level of message generation, which is at least one stratum above the sememic
stratum. This is also the point where the information processing phase is
least understood.

In the decoding process the input stream is analyzed into its several 
linguistic components called sememes. At this point we have done all we can 
do with linguistics, but we still do not have an understanding of what the 
verbal input said except in a linguistic sense. What lies beyond this stage? 
Roger Schank has suggested the Conceptual Dependency theory as the next step 
toward meaning. (2) This fits in well with stratificational grammar but 
does not give us the mechanism we want. Therefore we have decided to modify
this approach also.

*Dr. James Copeland, Rice University, is working on these phases of the system.

**"Linguistic Cues to the Workings of the Mind," a lecture by Sydney Lamb given
at the Pennsylvania State University, November 16, 1970.

In developing my approach to artificial intelligence I have tried to keep
in mind the idea of Piaget (3) that "Knowledge is neither solely in the subject,
nor in a supposedly independent object, but is constructed by the subject as
an indissociable subject-object relation." To establish this relation we must
interact with the object. This appears to be saying that when a brain is
stimulated with an external signal, the resulting neurological trace
constitutes the relation of subject to object. Knowledge, then, becomes
a network of relationships, with each node being a concept. The evaluation
of these concepts in the mind of a receiver may be accomplished by triggering
any number of paths leading into this node. The outside signal may, of
course, be either verbal or nonverbal. My problems are considerably reduced
since I am working only with verbal signals. This means that I have the
problem of tying together two relational networks: the one for language and
the one for concepts.

Piaget's developmental theories also make the point that knowledge is
not a static quality but a dynamic relation. One is aware of his knowledge
only when it is brought forth, that is, when the nodes constituting that piece
of knowledge, or subtending that area of semantic space, are activated. Which
nodes are activated depends upon the subject area being discussed
and the focus of the discussion*. It may be assumed that this is also the
way new knowledge is acquired. As information is processed, new connections
are activated between nodes in the network. This may mean that the internal
signal must pass through several other nodes to activate the primary concept
sought. When this is the case, the primary concept becomes more abstract
and/or less definable. The connotative meaning is traceable to the peripheral
concepts stimulated in the process of activating the primary concept.

*It is at this point that theories concerning "attitudinal frame of reference"
(Rokeach) and psychological set, or one's bias, enter. Space does not allow
a discussion of these forces at this time. It should be obvious that one's
view of reality will have a definite effect on the way messages are created,
both from the decoding process and for the encoding process. Dr. William
Nelson (SUNY at Oneonta) is working on this problem in our project.

Though knowledge is only realizable through stimulation of dynamic
relations, it should be obvious that, if this network were to be frozen, one
could construct a hierarchical structure of the concepts contained in the
system. Such a structure would have ambiguities where the connecting links
between three or more concepts are circuitous, i.e., when it depends upon
where you start as to which concept is subordinate to which. Since the
English language has a high degree of redundancy, and precision
is often lacking in our attempts to define various concepts, we see how rapidly
this hierarchical structure dissipates into a myriad of paths leading everywhere.
Yet we can build such a hierarchy of concepts, when pressed, indicating that
this is a necessary part of the simulation model for verbal communication. (4)

The hierarchy of concepts may be distinguished in several ways. All of
these hierarchies are interconnected and influence the meaning we give to any
particular concept. We may talk about the affective concepts which culminate
in the three major dimensions of Osgood (5) (evaluative, activity, and potency).
Each concept we encounter has associated with it various aspects of these
dimensions. Concepts may also be ordered according to their subject area,
or by categories such as Roget's Thesaurus uses, or as the rhetoricians have
presented. (6) It should be noted that these hierarchical structures are not
superimposed upon the concepts contained in the knowledge bank but rather
evolve from the relations developed in the concept structures. In this way
it follows that concepts from many different hierarchical structures may be
associated with any given concept, and depending upon what focus you choose
around which to structure your hierarchy, you find that other concepts become
subordinate to the concepts found in your main structure. It is this
mechanism which allows us to start at any concept and proceed to any other,
establishing new hierarchies as we search our memory.

ILLUSTRATION

All connecting lines are two way. Therefore, one can start at any node
and progress to any other node. There are also many different routes by which
this process can be accomplished. Given this type of network, it follows that
the hierarchy of concepts depends on where one starts.
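
The dependence of the hierarchy on the starting concept can be sketched with a small two-way network. This is a minimal illustration in Python, not the ARIS implementation. The concept names and links are invented for the example, and breadth-first search is an assumed way of "freezing" the network into a hierarchy; the paper does not specify one.

```python
from collections import deque

# Hypothetical two-way links between concepts (names chosen for illustration only).
LINKS = [("ANIMAL", "DOG"), ("DOG", "PET"), ("PET", "CAT"), ("CAT", "ANIMAL")]

def build_network(links):
    """Store every link in both directions, since all connecting lines are two way."""
    network = {}
    for a, b in links:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network

def hierarchy_from(network, start):
    """Breadth-first search from `start`; returns {concept: superordinate concept}."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in sorted(network[node]):  # sorted only to make the result repeatable
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return parent

network = build_network(LINKS)
from_animal = hierarchy_from(network, "ANIMAL")
from_pet = hierarchy_from(network, "PET")
```

Starting from ANIMAL, the concept DOG is subordinate to ANIMAL; starting from PET, the same concept is subordinate to PET. Because the four links form a circuit, neither ordering is privileged, which is the ambiguity noted for circuitous links.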

The association of concepts in our memory plays a major role in how
we decode and/or encode verbal signals. The fact that many of us have automatic
(learned) triggering of a given affective concept (say, the concept represented
by Bad, Evil, Sin, etc.) when we decode various, highly diversified substantive
concepts such as adultery, abortion, smoking, liquor, communism, etc., is an
example of the multiple associations established among some concepts. Any
time one of the substantive concepts is decoded, the affective concept may also
be triggered and, if so, becomes part of the meaning generated by the input
signal. If this automatic triggering is strong enough, it may block the decoding
of the rest of the input signal. Automatic triggering will be handled in
ARIS by computing a probability factor from the frequency of association of
any two concepts in the concept structure (the neurological counterparts for this
may be contained in the production of RNA and its associates). Thus the
automatic association of two concepts may change with usage (learning?).
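
One way such a probability factor might be computed is sketched below in Python. The paper gives no formula, so the relative-frequency estimate, the class name `ConceptStructure`, and the 0.5 threshold are all assumptions made for illustration.

```python
from collections import defaultdict

class ConceptStructure:
    """Hypothetical store of how often concepts are activated together."""

    def __init__(self):
        self.activations = defaultdict(int)    # times each concept was activated
        self.associations = defaultdict(int)   # times an ordered pair was activated together

    def record(self, concepts):
        """Record one decoding in which all the given concepts were activated."""
        unique = set(concepts)
        for a in unique:
            self.activations[a] += 1
            for b in unique:
                if a != b:
                    self.associations[(a, b)] += 1

    def trigger_probability(self, source, target):
        """Relative frequency with which `target` accompanied `source`."""
        if self.activations[source] == 0:
            return 0.0
        return self.associations[(source, target)] / self.activations[source]

    def triggered(self, source, threshold=0.5):
        """Concepts assumed to fire automatically when `source` is decoded."""
        return sorted(
            b for (a, b) in self.associations
            if a == source and self.trigger_probability(a, b) >= threshold
        )

structure = ConceptStructure()
structure.record(["SMOKING", "BAD"])
structure.record(["SMOKING", "BAD", "LIQUOR"])
structure.record(["SMOKING", "PLEASURE"])
structure.record(["SMOKING", "BAD"])
```

Because the counts are updated on every decoding, the probability linking two concepts drifts with usage, which is the sense in which the association may be said to be learned.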

EXAMPLE I

When a sentence is decoded, how do we know what has been said? As the
verbal stream is processed, concepts are triggered. These usually remain
open; that is, the many paths leading from them to other concepts are not closed
until the input is complete. Closure is then instigated. The resulting
ring structure is the meaning we get from the input. The activation of this
ring structure (meaning) is remembered through the development of the
automatic triggering probability. Inherent in what I have already said
is the fact that the focus one is operating under while decoding has a
definite effect on the automatic triggering of an associated concept when
the target concept has been decoded.

Let's work through an example to see how ARIS will function. Given
the input sentence "I HEAR LOS ANGELES IS SINKING." and skipping over
much of the linguistic decoding process, we might have the following:

[Figure not reproduced]

The resulting ring structure, depending on the focus of the decoding,
might be

EXAMPLE II



where concepts A-N might be represented verbally by

A = Positive affection
B = Honesty
C = Egotism
D = Understand
E = Know
F = Think
G = Negative affection
H = SMOG
I = Pretty
J = Familiarity
K = Active
L = Presently
M = Destruction
N = Fear

Because concepts 1 and 2 trigger off positive subconcepts A and E, the
resulting positive statement (concepts 3, 4, and 5) develops a stronger
probability of automatic triggering if any one of them is decoded from
verbal input or encoded for verbal output. In this way the knowledge bank
is developed and various hierarchical structures emerge.

The complexity of this system should be apparent to all. The simple
sentence used in the above example could have many more possible results
than the one given. If the first person I is not known, the whole sentence
might be discarded with only minimal strengthening between the concepts
represented by LOS ANGELES and SINKING. On the other hand, if a strong relation
already exists between these two concepts, this will help to establish the
integrity of the speaker. Thus the speaker's identity must be known to the
system. Identification of speakers for ARIS is provided at sign-on time.

ARIS, like humans, relates each user to a hierarchy of concepts represented
verbally by trusted friend, friend, acquaintance, etc. Thus if the speaker
uses the third person, "Jack said that Los Angeles is sinking," the evoked
ring structure would involve ARIS' feelings about Jack. If it knows no
Jacks, then its immediate reply would be "Who is Jack?" since it could not
complete closure.
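
The failure of closure on an unknown source can be sketched very simply. The Python below is an illustration only: the table `KNOWN_PEOPLE`, its entries, and the function `attempt_closure` are hypothetical, and the real closure process would involve the whole ring structure rather than a single lookup.

```python
# Hypothetical table relating known people to a trust concept.
KNOWN_PEOPLE = {"GEORGE": "trusted friend", "BILL": "friend"}

def attempt_closure(source, known_people):
    """Return (closed, result). Closure fails when the source of a report is unknown."""
    relation = known_people.get(source.upper())
    if relation is None:
        # The ring structure cannot be closed, so ask about the missing concept.
        return False, f"Who is {source}?"
    return True, relation

unknown = attempt_closure("Jack", KNOWN_PEOPLE)
known = attempt_closure("George", KNOWN_PEOPLE)
```

Here `unknown` carries the clarifying question, while `known` carries the trust concept that would enter the evoked ring structure.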

One cannot possibly keep track of all the steps taking place in a
system such as this. To try to do so is self-defeating. Therefore the
system is composed of algorithms which compute automatic triggering
probabilities, store appropriate data, build lexical entries, etc. Since
the whole system is a network of relations forming hierarchical strata, much
of the bookkeeping of other systems is unnecessary. It is impossible to
discuss many of the problems and the solutions to these problems in a paper
of this length. Further, many of the problems that are known have not
been solved yet, nor are we fully aware of all the problems we shall encounter.
However, we do have a few answers, a few theories, and a few ideas about how
the whole system can be developed. With a little luck and a lot of hard work,
I think ARIS will become a reality.